-
Notifications
You must be signed in to change notification settings - Fork 414
Somewhat clean up closure pipelines #3881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Somewhat clean up closure pipelines #3881
Conversation
`ChannelManager::force_close_channel_with_peer` takes the peer's node_id as a parameter and then returns it, presumably leftover from before we stored channels indexed by peers. Here we simply drop the return value.
When a channel is "coop" closed prior to it being funded, it is closed immediately. Instead of reporting `ClosureReason::*InitiatedCooperativeClosure` (which would be somewhat misleading), we report `ClosureReason::CounterpartyCoopClosedUnfundedCHannel` when our counterparty does it. However, when we do it, we report `ClosureReason::HolderForceClosed`, which is highly confusing given the user did *not* call a force-closure method. Instead, here, we add a `ClosureReason::LocallyCoopClosedUnfundedChannel` to match the `CounterpartyCoopClosedUnfundedCHannel` variant and use it.
`ClosureReason::HolderForceClosed` says explicitly in its docs "Closure generated from `ChannelManager...`, called by the user." However, we were using it extensively when we fail a channel due to errors handling messages or generating our own responses. Here, we convert these cases to the catch-all `ClosureReason::ProcessingError` which exists to communicate errors including a string that describes what happened. This should substantially improve debug-ability in closure cases by avoiding having to pull up logs.
Here we fix a few cases of swallowing appropriate `ClosureReason`s and expand `ClosureReason::FundingTimedOut` to include when we didn't get a channel funded in time, rather than using `HolderForceClosed`.
In d17e759 we added the ability for users to decide which message to send to peers when we force-close a channel. Here we pipe that message back out through the channel closure event via a new `message` field in `ClosureReason::HolderForceClosed`. This is nice to have but more importantly will be used in a coming commit to simplify the internal interface to the force-closure logic.
👋 Thanks for assigning @tnull as a reviewer! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a first pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to document all these subtle API changes in a pending_changelog
, as users might be leaning on a certain error variant.
lightning/src/events/mod.rs
Outdated
(2, HolderForceClosed) => { (1, broadcasted_latest_txn, option) }, | ||
(2, HolderForceClosed) => { | ||
(1, broadcasted_latest_txn, option), | ||
(3, message, required), // XXX: upgrade to empty string or whatever and document |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: unaddressed XXX
. Probably could use a default_value
with "Channel force-closed"
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's just a default we use in tests, its not a real default, went ahead and made the default "", and documented it.
lightning/src/ln/monitor_tests.rs
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this commit is touching a lot of test code now. Should the respective files be converted to rustfmt::skip
first, so that you can remove them for the parts that you touch?
/// Some links printed in log lines are included here to check them during build (when run with | ||
/// `cargo doc --document-private-items`): | ||
/// [`super::channelmanager::ChannelManager::force_close_without_broadcasting_txn`] and | ||
/// [`super::channelmanager::ChannelManager::force_close_all_channels_without_broadcasting_txn`]. | ||
#[rustfmt::skip] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanna remove this skip
while you're at it?
@@ -4350,85 +4360,66 @@ where | |||
/// `peer_msg` should be set when we receive a message from a peer, but not set when the | |||
/// user closes, which will be re-exposed as the `ChannelClosed` reason. | |||
#[rustfmt::skip] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder how this commit interacts with LSPS2 'client-trusts-LSP' mode and other 'hosted channel' scenarios where we'd delay broadcasting the funding transaction. Probably won't make any material difference as we don't support them well currently anyways, but it's a bit odd that we'd then have no API that would allow the user to avoid broadcasting an FC spend for a funding transaction of which only the user would know that it was never broadcast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I'm not really a fan of avoiding broadcast in such a case. If we aren't confident that the funding tx hasn't been broadcasted, I'd rather broadcast anyway and let the user suffer the rather minor inconvenience of getting a transaction broadcasted which cannot confirm rather than risk bugs leading to funds loss.
Mind putting that PR up as draft already? Would help me gauge whether I should hold off on simple close related refactors until these two PRs are in. |
Removes totally unnecessary `#[inline]`s on `MsgHandleErrInternal` methods (LLVM will decide for us if it makes sense to inline something, explicit `#[inline(always)]` belongs only on incredibly performance-sensitive code and `#[inline]` belongs only on short public methods that should be inlined into downstream crates) and `rustfmt` `MsgHandleErrInternal::from_chan_no_close`.
`ChannelManager` has two public methods to force-close a (single) channel - `force_close_broadcasting_latest_txn` and `force_close_without_broadcasting_txn`. The second exists to allow a user who has restored a stale backup to (sort of) recover their funds by starting, force-closing channels without broadcasting their latest state, and only then reconnecting to peers (which would otherwise cause a panic due to the stale state). To my knowledge, no one has ever implemented this (fairly half-assed) recovery flow, and more importantly its rather substantially broken. `ChannelMonitor` has never tracked any concept of whether a channel is stale, and if we connect blocks towards an HTLC time-out it will immediately broadcast the stale state anyway. Finally, we really shouldn't encourage users to try to "recover" from a stale state by running an otherwise-operational node on the other end. Rather, we should formalize such a recovery flow by having a `ChainMonitor` variant that takes a list of `ChannelMonitor`s and claims available funds, dropping the `ChannelManager` entirely. While work continues towards that goal, having long-broken functionality in `ChannelManager` is also not acceptable, so here we simply remove the "without broadcasting" force-closure version. Fixes lightningdevkit#2875
While most of our closure logic mostly lives in `locked_close_chan`, some important steps happen in `convert_channel_err` and `handle_error` (mostly broadcasting a channel closure `ChannelUpdate` and sending a relevant error message to our peer). While its fine to use `locked_close_chan` directly when we manually write out the relevant extra steps, its nice to avoid it to DRY up some of our closure logic, which we do here.
`Channel`'s new `FundingScope` exists to store both the channel's live funding information as well as any in-flight splices. In order to keep the patches which introduced and began using it simple, it was exposed outside of `channel.rs` to `channelmanager.rs`, breaking the `Channel` abstraction somewhat. Here we remove one case of `FundingScope` being passed around `channelmanager.rs` (in the hopes of eventually keeping it entirely contained within `channel.rs`). Specifically, we remove the `FundingScope` parameter from `locked_close_channel`.
In a previous commit, we removed the ability for users to pick whether we will broadcast a commitment transaction on channel closure. However, that doesn't mean that there is no value in never broadcasting commitment transactions on channel closure. Rather, we use it to avoid broadcasting transactions which we know cannot confirm if the channel's funding transaction was not broadcasted. Here we make this relationship more formal by splitting the force-closure handling logic in `Channel` into the existing `ChannelContext::force_shutdown` as well as a new `ChannelContext::abandon_unfunded_chan`. `ChannelContext::force_shutdown` is the only public method, but it delegates to `abandon_unfunded_chan` based on the channel's state. This has the nice side effect of avoiding commitment transaction broadcasting when a batch open fails to get past the funding stage.
12c0a08
to
e73434f
Compare
@@ -1722,6 +1722,7 @@ pub fn check_closed_broadcast(node: &Node, num_channels: usize, with_error_msg: | |||
dummy_connected = true; | |||
} | |||
let msg_events = node.node.get_and_clear_pending_msg_events(); | |||
eprintln!("{:?}", msg_events); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There appears to be a debug print statement (eprintln!("{:?}", msg_events);
) in the functional_test_utils.rs file that was likely added during development. This statement will print debug information to stderr during normal test execution, which isn't appropriate for committed code. This debug print should be removed before merging.
Spotted by Diamond
Is this helpful? React 👍 or 👎 to let us know.
Still have a few commits on top here but this ended up getting quite big so I'll do them in a followup. There's a pile of things to DRY things up and make small wins here or there, but the main two commits are:
and